cell biology
CellVerse: Do Large Language Models Really Understand Cell Biology?
Zhang, Fan, Liu, Tianyu, Zhu, Zhihong, Wu, Hao, Wang, Haixin, Zhou, Donghao, Zheng, Yefeng, Wang, Kun, Wu, Xian, Heng, Pheng-Ann
Recent studies have demonstrated the feasibility of modeling single-cell data as natural language and the potential of leveraging powerful large language models (LLMs) for understanding cell biology. However, a comprehensive evaluation of LLMs' performance on language-driven single-cell analysis tasks remains unexplored. Motivated by this challenge, we introduce CellVerse, a unified language-centric question-answering benchmark that integrates four types of single-cell multi-omics data and encompasses three hierarchical levels of single-cell analysis tasks: cell type annotation (cell-level), drug response prediction (drug-level), and perturbation analysis (gene-level). Going beyond this, we systematically evaluate the performance of 14 open-source and closed-source LLMs, ranging from 160M to 671B parameters, on CellVerse. Remarkably, the experimental results reveal: (1) Existing specialist models (C2S-Pythia) fail to make reasonable decisions across all sub-tasks within CellVerse, while generalist models such as Qwen, Llama, GPT, and DeepSeek family models exhibit preliminary understanding capabilities within the realm of cell biology. (2) The performance of current LLMs falls short of expectations and has substantial room for improvement. Notably, in the widely studied drug response prediction task, none of the evaluated LLMs demonstrate significant performance improvement over random guessing. CellVerse offers the first large-scale empirical demonstration that significant challenges still remain in applying LLMs to cell biology. By introducing CellVerse, we lay the foundation for advancing cell biology through natural language and hope this paradigm could facilitate next-generation single-cell analysis.
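The comparison against random guessing mentioned in the abstract can be illustrated with a minimal sketch. This is not the benchmark's own evaluation code: the function name `binomial_tail_p`, the binary question format, and the counts of 54 and 70 correct answers out of 100 are all hypothetical, chosen only to show how an exact one-sided binomial test separates a near-chance score from a clearly better one.

```python
from math import comb


def binomial_tail_p(correct: int, total: int, chance: float) -> float:
    """Exact one-sided p-value: P(X >= correct) if every answer were a random guess."""
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical scores on 100 two-option questions (chance accuracy = 0.5).
p_near_chance = binomial_tail_p(correct=54, total=100, chance=0.5)
p_clear_signal = binomial_tail_p(correct=70, total=100, chance=0.5)

print(f"54/100 correct: p = {p_near_chance:.3f}")
print(f"70/100 correct: p = {p_clear_signal:.2e}")
```

Under these assumed numbers, the first score would not be distinguishable from guessing at the usual 0.05 threshold, while the second would be.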
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.93)
Deepcell Appoints New Head of Bioinformatics to Support Rapid Company Growth
Deepcell, a life science company pioneering AI-powered cell classification and isolation for cell biology and translational research, today announced the appointment of Kevin Jacobs as the Vice President of Bioinformatics. Jacobs will be responsible for the company's bioinformatics strategy, its implementation, and its integration with other areas and into the company's offerings. This appointment is the latest addition to Deepcell's rapidly expanding team of scientists, engineers and computer science experts. Deepcell raised $20 million in Series A funding last year. Currently, Deepcell is helping to advance precision medicine by combining advances in AI, cell classification and capture, and single-cell analysis to deliver novel insights through an unprecedented view of cell biology.
- Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.06)
- North America > United States > California > San Francisco County > San Francisco (0.06)
- Health & Medicine (0.99)
- Banking & Finance > Capital Markets (0.56)
An easy-to-use platform is a gateway to AI in microscopy
A new, freely available platform helps non-experts use artificial intelligence to analyse microscopy images. The platform has been developed at Åbo Akademi University in Finland and Instituto Gulbenkian de Ciência, Portugal, and will be of great help in research and diagnostics using modern-day microscopes. Software using artificial intelligence (AI) is revolutionizing how microscopy images are analysed. For instance, AI can be used to detect features in images (e.g., tumours in biopsy samples) or improve the quality of images by removing unwanted noise. However, non-experts continue to find AI technologies difficult to use.
Phylodynamics for cell biologists
Advances in experimental approaches for single-cell analysis allow in situ sequencing, genomic barcoding, and mapping of cell lineages within tissues and organisms. Large amounts of data have thus accumulated and present an analytical challenge. Stadler et al. recognized the need for conceptual and computational approaches to fully exploit these technological advances for the understanding of normal and disease states. The authors review ideas taken from phylodynamics of infectious disease and show how similar tree-building techniques can be applied to monitoring changes in somatic cell lineages for applications ranging from development and differentiation to cancer biology.

Science, this issue p. [eaah6266][1]

### BACKGROUND

The birth, death, and diversification of individuals are events that drive biological processes across all scales. This is true whether the individuals in question represent nucleic acids, cells, whole organisms, populations, or species. The ancestral relationships of individuals can be visualized as branching trees or phylogenies, which are long-established representations in the fields of evolution, ecology, and epidemiology. Molecular phylogenetics is the discipline concerned with the reconstruction of such trees from gene or genome sequence data. The shape and size of such phylogenies depend on the past birth and death processes that generated them, and in phylodynamics, mathematical models are used to infer and quantify the dynamical behavior of biological populations from ancestral relationships. New technological advances in genetics and cell biology have led to a growing body of data about the molecular state and ancestry of individual cells in multicellular organisms. Ideas from phylogenetics and phylodynamics are being applied to these data to investigate many questions in tissue formation and tumorigenesis.
### ADVANCES

Trees offer a valuable framework for tracing cell division and change through time, beginning with individual ancestral stem cells or fertilized eggs and resulting in complex tissues, tumors, or whole organisms (see the figure). They also provide the basis for computational and statistical methods with which to analyze data from cell biology. Our Review explains how “tree-thinking” and phylodynamics can be beneficial to the interpretation of empirical data pertaining to the individual cells of multicellular organisms. We summarize some recent research questions in developmental and cancer biology and briefly introduce the new technologies that allow us to observe the spatiotemporal histories of cell division and change. We provide an overview of the various and sometimes confusing ways in which graphical models, based on or represented by trees, have been applied in cell biology. To provide conceptual clarity, we outline four distinct graphical representations of the history of cell division and differentiation in multicellular organisms. We highlight that cells from an organism cannot be always treated as statistically independent observations but instead are often correlated because of phylogenetic history, and we explain how this can cause difficulties when attempting to infer dynamical behavior from experimental single-cell data. We introduce simple ecological null models for cell populations and illustrate some potential pitfalls in hypothesis testing and the need for quantitative phylodynamic models that explicitly incorporate the dependencies caused by shared ancestry.

### OUTLOOK

We expect the rapid growth in the number of cell-level phylogenies to continue, a trend enhanced by ongoing technological advances in cell lineage tracing, genomic barcoding, and in situ sequencing.
In particular, we anticipate the generation of exciting datasets that combine phenotypic measurements for individual cells (such as through transcriptome sequencing) with high-resolution reconstructions of the ancestry of the sampled cells. These developments will offer new ways to study developmental, oncogenic, and immunological processes but will require new and appropriate conceptual and computational tools. We discuss how models from phylogenetics and phylodynamics will benefit the interpretation of the data sets generated in the foreseeable future and will aid the development of statistical tests that exploit, and are robust to, cell shared ancestry. We hope that our discussion will initiate the integration of cell-level phylodynamic approaches into experimental and theoretical studies of development, cancer, and immunology. We sketch out some of the theoretical advances that will be required to analyze complex spatiotemporal cell dynamics and encourage explorations of these new directions. Powerful new statistical and computational tools are essential if we are to exploit fully the wealth of new experimental data being generated in cell biology.

Figure: Multicellular organisms develop from a single fertilized egg. The division, apoptosis, and differentiation of cells can be displayed in a development tree, with the fertilized egg being the root of the tree. The development of any particular tissue within an organism can be traced as a subtree of the full developmental tree. Subtrees that represent cancer tumors or B cell clones may exhibit rapid growth and genetic change. Here, we illustrate the developmental tree of a human and expand the subtree representing haematopoiesis (blood formation) in the bone marrow. Stem cells in the bone marrow differentiate, giving rise to the numerous blood cell types in humans. The structure of the tree that underlies haematopoiesis and the formation of all tissues is unclear. Phylogenetic and phylodynamic tools can help to describe and statistically explore questions about this cell differentiation process.

Multicellular organisms are composed of cells connected by ancestry and descent from progenitor cells. The dynamics of cell birth, death, and inheritance within an organism give rise to the fundamental processes of development, differentiation, and cancer. Technical advances in molecular biology now allow us to study cellular composition, ancestry, and evolution at the resolution of individual cells within an organism or tissue. Here, we take a phylogenetic and phylodynamic approach to single-cell biology. We explain how “tree thinking” is important to the interpretation of the growing body of cell-level data and how ecological null models can benefit statistical hypothesis testing. Experimental progress in cell biology should be accompanied by theoretical developments if we are to exploit fully the dynamical information in single-cell data.

[1]: /lookup/doi/10.1126/science.aah6266
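The review's point that cells from one organism are not statistically independent can be illustrated with a toy simulation. This is only a sketch under strong assumptions, not a model taken from the review: every cell divides synchronously, no cell dies, and each daughter inherits its mother's trait value plus Gaussian noise. The names `simulate_lineage` and `step_sd` are hypothetical.

```python
import random
from statistics import mean


def simulate_lineage(generations: int, step_sd: float, rng: random.Random) -> list[float]:
    """Grow a binary cell tree from one founder cell with trait value 0.0.

    Each division yields two daughters whose trait is the mother's trait plus
    independent Gaussian noise. Returns the final generation, ordered so that
    cells 2i and 2i + 1 are sisters.
    """
    cells = [0.0]
    for _generation in range(generations):
        cells = [
            mother + rng.gauss(0.0, step_sd)
            for mother in cells
            for _daughter in range(2)
        ]
    return cells


rng = random.Random(0)  # fixed seed so the run is reproducible
cells = simulate_lineage(generations=10, step_sd=1.0, rng=rng)

# Mean trait difference between sister cells (most recent shared ancestor).
sister_gap = mean(abs(cells[i] - cells[i + 1]) for i in range(0, len(cells), 2))

# Mean trait difference between randomly paired cells from the same tree.
random_gap = mean(abs(a - b) for a, b in (rng.sample(cells, 2) for _ in range(2000)))

print(f"sisters: {sister_gap:.2f}, random pairs: {random_gap:.2f}")
```

In this toy setting, sisters resemble each other more than randomly chosen cells do, purely because of shared ancestry. A statistical test that treats all sampled cells as independent would ignore that structure.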
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.78)
- Health & Medicine > Therapeutic Area > Immunology (0.53)
Johnson & Johnson Post-doc federated and privacy-preserving machine learning Beerse, Belgium Informatics
Janssen Research & Development seeks to drive innovation and deliver transformational medicines for the treatment of diseases in six therapeutic areas: neuroscience, cardiovascular diseases and metabolism, infectious diseases, immunology, oncology and pulmonary hypertension. In these areas, Janssen aims to address and solve unmet medical needs through the development of small and large molecules, as well as vaccines. The Janssen campus in Beerse (Belgium) has a unique ecosystem covering the complete drug development life cycle, with all capabilities from basic science to market access on one campus. The integrated environment of our campus gives our people the chance to experience many different aspects of drug development throughout their career. It has a successful track record of over sixty years of drug discovery and development and is one of the most important innovation engines of the Janssen group worldwide.
- Information Technology > Artificial Intelligence > Machine Learning (0.79)
- Information Technology > Data Science > Data Mining > Big Data (0.45)
Researchers have read the mind of a "black box" AI using cell biology – Fanatical Futurist by International Keynote Speaker Matthew Griffin
The deep neural networks that power today's Artificial Intelligence (AI) systems work in mysterious ways. They're black boxes – a question goes in and an answer comes out the other side, and while we might not know exactly how a black box AI system works, importantly we know that it does work. Over the past year there have been a few attempts to read and analyse the minds of these black boxes, from companies like Nvidia, which uses visualisations, to MIT, which tried to analyse the neural network's layers, to Columbia University, which pitted AIs against each other. But, frankly, none of them comes close to the out-of-the-box thinking, if you'll excuse the pun, of this approach – using biology itself to crack the black box open. A new study that mapped a neural network to the biological components within a simple yeast cell allowed researchers to watch the AI system at work. It also gave them insights into cell biology in the process, and the resulting technology could help in the quest for new cancer drugs and personalised treatments.
- Transportation > Air (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
Cracking Open the Black Box of AI with Cell Biology
The deep neural networks that power today's artificial intelligence systems work in mysterious ways. They're black boxes: A question goes in ("Is this a photo of a cat?" "What's the best next move in this game of Go?" "Should this self-driving car accelerate at this yellow light?"), and an answer comes out the other side. We may not know exactly how a black box AI system works, but we know that it does work. But a new study that mapped a neural network to the components within a simple yeast cell allowed researchers to watch the AI system at work. And it gave them insights into cell biology in the process.
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Leisure & Entertainment > Games (0.90)
- Transportation > Air (0.83)